IoT Temperature Forecasting

Keywords: IoT, Time Series Analysis, Pre-processing, EDA, Bayesian Modeling

1. Overview

Project Detail

This dataset stores temperature readings from IoT devices installed inside and outside an anonymous room. Because the device was in its testing phase, it was uninstalled or shut off several times during the reading period, which caused some outliers and missing values.


Dataset details:

  • id : unique IDs for each reading
  • room_id/id : ID of the room in which the device was installed (currently only 'admin room', for example purposes)
  • noted_date : date and time of reading
  • temp : temperature readings
  • out/in : whether the reading was taken from the device installed inside or outside the room

We can enjoy exploring the following:
  • the relationship of inside and outside temperature
  • trend or seasonality in the data
  • forecasting future temperature by using time-series modeling
  • characteristic tendencies by year, month, week, or day/night
  • and so on...

Goal of this notebook

  • Practice data cleansing techniques
  • Practice EDA techniques for time-series data
    • Series decomposition into trend/seasonality
  • Practice visualisation techniques
  • Practice time-series modeling techniques
    • Prophet

2. Import libraries

3. Load the dataset

4. Pre-processing

The column 'room_id/id' has only one value (Room Admin), so we do not need it for the analysis.

Rename the remaining columns so that they are easier to understand.
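The two steps above can be sketched with pandas as follows. The sample rows are made up for illustration, and the new column names 'date' and 'place' are assumptions chosen to match the names used later in this notebook.

```python
import pandas as pd

# Hypothetical sample rows; the values are invented for illustration only.
df = pd.DataFrame({
    "id": ["log_3", "log_2", "log_1"],
    "room_id/id": ["Room Admin"] * 3,
    "noted_date": ["08-12-2018 09:30", "08-12-2018 09:30", "08-12-2018 09:29"],
    "temp": [29, 29, 41],
    "out/in": ["In", "In", "Out"],
})

# A column with a single distinct value carries no information for analysis.
if df["room_id/id"].nunique() == 1:
    df = df.drop(columns=["room_id/id"])

# Assumed target names: 'date' for the timestamp, 'place' for inside/outside.
df = df.rename(columns={"noted_date": "date", "out/in": "place"})
print(df.columns.tolist())
```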

Datetime information

The datetime column holds a lot of information, such as year, month, weekday and so on. To use this information in the EDA and modeling phases, we need to extract it from the datetime column.
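A minimal sketch of this extraction, assuming the timestamps are strings in day-month-year order and that the column is named 'date' (both are assumptions, as are the sample values):

```python
import pandas as pd

# Hypothetical timestamps; the day-month-year format is an assumption.
df = pd.DataFrame({"date": ["08-12-2018 09:30", "12-09-2018 03:09"]})

# Parse the strings once, then derive calendar features via the .dt accessor.
df["date"] = pd.to_datetime(df["date"], format="%d-%m-%Y %H:%M")
df["year"] = df["date"].dt.year
df["month"] = df["date"].dt.month
df["day"] = df["date"].dt.day
df["weekday"] = df["date"].dt.day_name()
df["weekofyear"] = df["date"].dt.isocalendar().week.astype(int)
df["hour"] = df["date"].dt.hour
print(df)
```

Stating the format explicitly avoids pandas guessing between day-first and month-first dates, which matters for ambiguous values such as 08-12-2018.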

Seasonal information

A function to convert the month variable into a season.
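One possible shape for such a function is shown below. The mapping uses three-month meteorological seasons for the northern hemisphere, which is an assumption; the season boundaries should be adjusted to the climate of the place where the sensors actually are.

```python
def month_to_season(month: int) -> str:
    """Map a month number (1-12) to a season name.

    Assumes northern-hemisphere meteorological seasons.
    """
    if month not in range(1, 13):
        raise ValueError(f"month must be in 1..12, got {month}")
    if month in (12, 1, 2):
        return "Winter"
    if month in (3, 4, 5):
        return "Spring"
    if month in (6, 7, 8):
        return "Summer"
    return "Autumn"


# Example: label a few months.
labels = [month_to_season(m) for m in (1, 4, 7, 10, 12)]
print(labels)
```

With a pandas DataFrame that has a numeric 'month' column, the same function could be applied as `df["month"].map(month_to_season)`.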

Timing information

Unique identifier defined by id

Duplication

A check for duplicated records showed that the data does contain duplicates, so we need to collapse each set of duplicate records into a single unique record.
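The duplicate check and removal could look like the following sketch. The rows are invented, and a record is treated as a duplicate only when every column matches, which is an assumption about how duplication is defined here.

```python
import pandas as pd

# Hypothetical rows; the first two are exact duplicates of each other.
df = pd.DataFrame({
    "id": ["log_2", "log_2", "log_1"],
    "date": ["2018-12-08 09:30", "2018-12-08 09:30", "2018-12-08 09:29"],
    "temp": [29, 29, 41],
    "place": ["In", "In", "Out"],
})

# duplicated() flags every repeat of an earlier, fully identical row.
n_duplicates = int(df.duplicated().sum())
print(f"duplicated rows: {n_duplicates}")

# Keep the first occurrence of each record and rebuild a clean index.
df = df.drop_duplicates().reset_index(drop=True)
```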

Uniqueness of id

At the same datetime (2018-09-12 03:09:00), there are many records, each with its own unique id.

The number of distinct numeric parts in 'id' equals the number of records in the data, so the numeric part uniquely identifies each record.

Add the numeric part of 'id' as a new identifier.
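As a sketch, the numeric part can be pulled out with a regular expression. The id strings below are illustrative and their exact format is an assumption, as is the new column name 'id_num'; the pattern would need adjusting if the real ids are shaped differently.

```python
import pandas as pd

# Illustrative id strings; the exact format is an assumption.
df = pd.DataFrame({
    "id": [
        "__export__.temp_log_196134_bd201015",
        "__export__.temp_log_196131_7bca51bc",
        "__export__.temp_log_196127_522915e3",
    ]
})

# Capture the run of digits that follows "log_" and convert it to an integer.
df["id_num"] = df["id"].str.extract(r"log_(\d+)_", expand=False).astype(int)

# The numeric part is a valid identifier only if it is unique per record.
is_unique_identifier = df["id_num"].nunique() == len(df)
print(df["id_num"].tolist(), is_unique_identifier)
```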

Gaps in id

There are gaps in 'id' column.

There is a gap in 'date' column when ordered by 'id'.
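Both kinds of gap can be located by sorting on the numeric identifier and differencing consecutive rows. This is a sketch on made-up values, assuming a numeric 'id_num' column and a parsed 'date' column.

```python
import pandas as pd

# Hypothetical readings: ids 104-106 are missing, and time jumps with them.
df = pd.DataFrame({
    "id_num": [101, 102, 103, 107, 108],
    "date": pd.to_datetime([
        "2018-09-12 03:05", "2018-09-12 03:06", "2018-09-12 03:09",
        "2018-09-14 10:00", "2018-09-14 10:01",
    ]),
})

df = df.sort_values("id_num").reset_index(drop=True)

# Difference to the previous row: a step above 1 means ids were skipped.
df["id_step"] = df["id_num"].diff()
df["time_step"] = df["date"].diff()

id_gaps = df[df["id_step"] > 1]
largest_time_gap = df["time_step"].max()
print(id_gaps[["id_num", "id_step", "time_step"]])
print("largest time gap:", largest_time_gap)
```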

5. EDA

Univariate Analysis

Monthly Readings

Temperature

The temperature readings clearly come from a mixture of several distributions.

Place

Season

Timing

Multivariate Analysis

Monthly Readings by Place

Temperature Distribution by Place

Temperature by Season

This clearly shows that the sensor data does not come from a European country, so my initial assumption was wrong.

Temperature by Timing

Time Series Analysis

Pre-processing for time-series analysis

Time-series analysis is most straightforward when the time index is unique. So we need to calculate the mean values by the 'date' column and drop the 'id' column.
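A minimal sketch of this aggregation follows. The rows are invented, and grouping by 'place' in addition to 'date' is an assumption made here so that inside and outside readings are not averaged together.

```python
import pandas as pd

# Hypothetical readings: three share the same timestamp.
df = pd.DataFrame({
    "id": ["a", "b", "c", "d"],
    "date": pd.to_datetime(["2018-09-12 03:09"] * 3 + ["2018-09-12 03:10"]),
    "temp": [30, 32, 40, 31],
    "place": ["In", "In", "Out", "In"],
})

# Drop the identifier, then average temperature per place and timestamp.
ts = (
    df.drop(columns=["id"])
      .groupby(["place", "date"], as_index=False)["temp"]
      .mean()
)

# Each place now has a unique time index, ready for time-series tools.
inside = ts[ts["place"] == "In"].set_index("date")["temp"]
print(ts)
```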

Monthly Temperature Mean

Daily Temperature Mean

Weekday Temperature Mean

WeekofYear Temperature Mean

Missing data

6. Modelling

Data Preparation

In addition to the temperature itself, I added season information, a time-related factor that affects temperature (especially outside).
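One way to prepare such data for Prophet is sketched below. Prophet expects the columns 'ds' and 'y'; encoding the season as 0/1 flag columns and passing them through `add_regressor` is an assumption about how the season information could be added, not necessarily what this notebook did. The helper names, the season boundaries and the sample values are all hypothetical.

```python
import pandas as pd

SEASONS = ["Winter", "Spring", "Summer", "Autumn"]


def month_to_season(month: int) -> str:
    # Assumed northern-hemisphere seasons: Dec-Feb is Winter, and so on.
    return SEASONS[(month % 12) // 3]


def add_season_flags(frame: pd.DataFrame) -> pd.DataFrame:
    # Add one 0/1 column per season, derived from the 'ds' timestamps.
    season = frame["ds"].dt.month.map(month_to_season)
    for name in SEASONS:
        frame[name] = (season == name).astype(int)
    return frame


def prepare(daily: pd.DataFrame) -> pd.DataFrame:
    # Prophet requires the timestamp in 'ds' and the target in 'y'.
    out = daily.rename(columns={"date": "ds", "temp": "y"})[["ds", "y"]].copy()
    return add_season_flags(out)


def fit_and_forecast(train: pd.DataFrame, periods: int) -> pd.DataFrame:
    # Imported lazily so the preparation helpers work without Prophet.
    from prophet import Prophet

    model = Prophet()
    # Only use flags that vary in the training data; a constant column
    # carries no signal.
    used = [name for name in SEASONS if train[name].nunique() > 1]
    for name in used:
        model.add_regressor(name)
    model.fit(train)
    # Regressor values must also be supplied for the future dates.
    future = add_season_flags(model.make_future_dataframe(periods=periods, freq="D"))
    forecast = model.predict(future)
    return forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]]


# Hypothetical daily means spanning an Autumn-to-Winter boundary.
daily = pd.DataFrame({
    "date": pd.date_range("2018-11-29", periods=4, freq="D"),
    "temp": [31.0, 30.5, 29.0, 28.5],
})
train = prepare(daily)
print(train)
```

With a realistically long series, `fit_and_forecast(train, periods=30)` would return a 30-day forecast with uncertainty bounds; the four-row sample above is only meant to show the shape of the prepared data.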

Build Model & Predict Future Temperature

7. Conclusion

What is the variance of the inside - outside room temperature?

Can we predict the next scenario?

8. References